My first introduction to the R programming language.
R is a language I’ve been curious about for some time. I had started learning a bit of Python when I kept reading and hearing about R. Python and R are the two programming languages most commonly used in data science. Out of curiosity and a bit of boredom, I decided to learn a bit of R syntax. After that experience, I wanted to keep going, so I chose to learn more R through DataQuest’s Data Analyst in R track.
This is the first of a series of posts documenting what I’m learning in the Data Analyst in R track. In this post, I’ll discuss the first lesson of the track, Introduction to Programming in R.
Much like the JavaScript and Python lessons I’ve taken before, this lesson started off with basic math and order of operations. Like those languages, R follows the standard order of operations when evaluating expressions, so anything in parentheses is evaluated first.
217 - 9
415 + 156
7 * 18
144 / 4
(45 - 3) / 6
(17 * 8) - 4
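To see why the parentheses matter, here is a small sketch that evaluates one of the expressions above with and without them. The variable names are just ones I picked for illustration.

```r
# Without parentheses, division happens before subtraction: 45 - (3 / 6)
without_parens <- 45 - 3 / 6

# With parentheses, the subtraction happens first: 42 / 6
with_parens <- (45 - 3) / 6

print(without_parens)  # 44.5
print(with_parens)     # 7
```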
After calculations like this, I moved on to variables. This is where things got interesting. Although creating variables in R is similar to JavaScript and Python, one key difference is the assignment operator. Where other programming languages use the equal sign (=), R conventionally uses an arrow (<-) to assign values. The equal sign also works for assignment in R, but the arrow is the preferred style. Let’s say I wanted to create some variables using the names of teas. It would look like this, first with the equal sign and then with the arrow:
chai = 5
matcha = 4
black = 2
green = 2
white = 3
chai <- 5
matcha <- 4
black <- 2
green <- 2
white <- 3
As with other languages, there are rules to follow when naming variables in R. Variable names can contain letters, numbers and underscores. Other special characters are not allowed, and a name cannot begin with a number or an underscore.
What was new to me, however, is that in R a dot can be used in a variable name. I believe this is the first time I’ve come across this rule in programming. Variable names can begin with a dot but the dot cannot be followed by a number.
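Here is a quick sketch of those naming rules. The variable names are made up for illustration, and is_valid_name() is a small helper I wrote for this example, not a built-in. It relies on base R's make.names(), which returns a name unchanged only when it is already syntactically valid.

```r
# Valid names: letters, numbers, underscores and dots
tea_price2 <- 5
tea.price <- 4
.hidden_tea <- 2  # a leading dot is fine as long as a number does not follow it

# Hypothetical helper: TRUE when make.names() leaves the name unchanged
is_valid_name <- function(name) make.names(name) == name

print(is_valid_name("tea.price"))    # TRUE
print(is_valid_name(".hidden_tea"))  # TRUE
print(is_valid_name("2tea"))         # FALSE, begins with a number
print(is_valid_name(".2tea"))        # FALSE, dot followed by a number
```

Note that make.names() also treats reserved words such as if as invalid, so this check is slightly stricter than the rules listed above.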
DataQuest provides a takeaway (I’ll discuss what this is in a bit) that links to this awesome resource from R-bloggers about variable naming conventions.
Next, I moved on to vectors. Vectors are storage objects that store a sequence of values. These values are assigned to a single variable. To create a vector you would write:
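A minimal sketch of what that could look like, reusing the tea variables from earlier. The numbers in tea_prices are made up for illustration.

```r
# Tea variables from earlier in the post
chai <- 5
matcha <- 4
black <- 2
green <- 2
white <- 3

# A vector built from literal values (illustrative numbers only)
tea_prices <- c(4, 6, 3, 3, 4)

# A vector built from existing variables
tea_flavors <- c(chai, matcha, black, green, white)

print(tea_prices)
print(tea_flavors)
```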
c() is a function whose name stands for concatenate. It takes multiple values as input and combines them into a single vector, which is then assigned to one variable. You can also use variable names to create a vector, as I did with the tea_flavors vector.
R has built-in functions. Some of them include:
mean(): average of the values in a vector
sum(): sum of the values in a vector
length(): total number of elements in a vector
min(): smallest value in a vector
max(): largest value in a vector
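As a quick sketch of a few of these, assuming a tea_prices vector with made-up values:

```r
# Illustrative prices, not real data
tea_prices <- c(4, 6, 3, 3, 4)

total_price <- sum(tea_prices)  # adds up every element
cheapest <- min(tea_prices)     # smallest element
priciest <- max(tea_prices)     # largest element

print(total_price)  # 20
print(cheapest)     # 3
print(priciest)     # 6
```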
These functions allow you to quickly operate across all the values in a vector. For example, if I wanted to know the average of tea_prices, I would use the mean() function to calculate the average price of the teas and store that value in a variable like so:
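A sketch of that calculation, assuming tea_prices holds made-up values. The name avg_tea_price is my own choice for the example.

```r
# Illustrative prices, not real data
tea_prices <- c(4, 6, 3, 3, 4)

# mean() averages all the elements; the result is stored in a new variable
avg_tea_price <- mean(tea_prices)

print(avg_tea_price)  # 4
```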
If I wanted to know how many elements were in tea_flavors, I would write:
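A sketch of that call, assuming tea_flavors was built from the five tea variables defined earlier:

```r
# Tea variables from earlier in the post
chai <- 5
matcha <- 4
black <- 2
green <- 2
white <- 3

tea_flavors <- c(chai, matcha, black, green, white)

# length() counts the elements in the vector
num_flavors <- length(tea_flavors)

print(num_flavors)  # 5
```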
I mentioned takeaways a bit earlier in the post. At the end of each lesson, DataQuest gives you a takeaway, a summary of what you learned in the lesson. It can be downloaded as a PDF and is great for reviewing concepts and syntax. It also gives you links to learn more about topics in the lesson. I can access my takeaways on my computer or my phone, and I read them frequently. They have been really helpful for me when I forget the syntax rules.
And with that said, this wraps up intro to programming in R! I’ll get more into vectors in my next post.
For attribution, please cite this work as
Brantley (2020, Jan. 9). Data Sci Dani: Intro to R Programming. Retrieved from https://datascidani.com/posts/intro_to_R 01-08-20/
BibTeX citation
@misc{brantley2020intro,
  author = {Brantley, Danielle},
  title = {Data Sci Dani: Intro to R Programming},
  url = {https://datascidani.com/posts/intro_to_R 01-08-20/},
  year = {2020}
}